Bengali Verb Subcategorization Frame Acquisition - A Baseline Model
Abstract
Acquiring verb subcategorization frames is important because verbs, more than other part-of-speech categories, take different types of arguments, each associated with a phrase in the sentence. This paper presents the acquisition of subcategorization frames for the Bengali verb Kara (do), which forms compound verbs when combined with various noun phrases. The main hypothesis is that the subcategorization frames of a Bengali verb are the same as those of its equivalent English verb with an identical sense tag. Syntax plays the main role in the acquisition of Bengali verb subcategorization frames. The output frames of the Bengali verbs have been compared with the frames of the equivalent English verbs identified using a Bengali-English bilingual lexicon. The flexible ordering of phrases and the attachment of optional phrases in Bengali sentences make frame acquisition challenging. The system achieved a precision of 77.11% and a recall of 88.23% on a test set of 100 sentences.
Similar resources
Acquiring Verb Subcategorization Frames in Bengali from Corpora
Subcategorization frame acquisition for a phrase can be described as a mechanism for extracting the different types of relevant arguments associated with that phrase in a sentence. This paper presents the acquisition of different subcategory frames for a specific Bengali verb that has been identified from POS-tagged and chunked data prepared from a raw Bengali news corpus. Syntax plays the main ...
The Automatic Acquisition Of Frequencies Of Verb Subcategorization Frames From Tagged Corpora
We describe a mechanism for automatically acquiring verb subcategorization frames and their frequencies in a large corpus. A tagged corpus is first partially parsed to identify noun phrases and then a linear grammar is used to estimate the appropriate subcategorization frame for each verb token in the corpus. In an experiment involving the identification of six fixed subcategorization frames, o...
Finding Emotion Holder from Bengali Blog Texts: An Unsupervised Syntactic Approach
This paper presents two different approaches for identifying emotion holders from Bengali blog sentences. Two types of strategies yield average agreement measures of 0.78 and 0.80 for annotating emotion holders with respect to all emotion classes. The baseline model is developed based on the combinations of various part-of-speech (POS) features extracted from the phrase-based similarities. The ...
A Connectionist Model of Verb Subcategorization
Much of the debate on rule-based vs. connectionist models in language acquisition has focussed on the English past tense. This paper investigates a new area, the acquisition of verb subcategorization. Verbs differ in how they express their arguments or subcategorize for them. For example, “She gave him a book.” is good, but “She donated him a book.” sounds odd. The paper describes a connectioni...
Learning Subcategorization
A method to identify the subcategorized constituents of a verb (its complements) automatically in a sentence is useful in various areas of Natural Language Processing (e.g. automatic acquisition of subcategorization lexicons, parsing, acquisition of verb semantics, information retrieval). I will describe a method for subcategorization identification that uses memory-based learning. Train and tes...